17 results for CHD Prediction, Blood Serum Data Chemometrics Methods

in DigitalCommons@The Texas Medical Center


Relevance:

100.00% 100.00%

Publisher:

Abstract:

Accurate quantitative estimation of exposure using retrospective data has been one of the most challenging tasks in the exposure assessment field. To improve these estimates, some models have been developed using published exposure databases with their corresponding exposure determinants. These models are designed to be applied to reported exposure determinants obtained from study subjects or exposure levels assigned by an industrial hygienist, so quantitative exposure estimates can be obtained. In an effort to improve the prediction accuracy and generalizability of these models, and taking into account that the limitations encountered in previous studies might be due to limitations in the applicability of traditional statistical methods and concepts, the use of computer science-derived data analysis methods, predominantly machine learning approaches, was proposed and explored in this study. The goal of this study was to develop a set of models using decision tree/ensemble and neural network methods to predict occupational outcomes based on literature-derived databases, and to compare, using cross-validation and data splitting techniques, the resulting prediction capacity to that of traditional regression models. Two cases were addressed: the categorical case, where the exposure level was measured as an exposure rating following the American Industrial Hygiene Association guidelines, and the continuous case, where the result of the exposure is expressed as a concentration value. Previously developed literature-based exposure databases for 1,1,1-trichloroethane, methylene dichloride, and trichloroethylene were used. When compared to regression estimations, results showed better accuracy of decision tree/ensemble techniques for the categorical case, while neural networks were better for estimation of continuous exposure values. Overrepresentation of classes and overfitting were the main causes of poor neural network performance and accuracy.
Estimations based on literature-based databases using machine learning techniques might provide an advantage when they are applied to other methodologies that combine 'expert inputs' with current exposure measurements, like the Bayesian Decision Analysis tool. The use of machine learning techniques to more accurately estimate exposures from literature-based exposure databases might represent a starting point for independence from expert judgment.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Increasing amounts of clinical research data are collected by manual data entry into electronic source systems and directly from research subjects. For these manually entered source data, common methods of data cleaning such as post-entry identification and resolution of discrepancies and double data entry are not feasible. However, the data accuracy achieved without these mechanisms may be lower than desired for a particular research use. We evaluated a heuristic usability method for utility as a tool to independently and prospectively identify data collection form questions associated with data errors. The method evaluated had a promising sensitivity of 64% and a specificity of 67%. The method was used as described in the literature for usability with no further adaptations or specialization for predicting data errors. We conclude that usability evaluation methodology should be further investigated for use in data quality assurance.
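As a reminder of how the sensitivity and specificity figures quoted above are defined, here is a minimal Python sketch. The counts are hypothetical and were chosen only to give values of similar magnitude; they are not the study's data, and the function names are illustrative.

```python
def sensitivity(true_pos: int, false_neg: int) -> float:
    """Share of truly error-prone questions that the method flags."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg: int, false_pos: int) -> float:
    """Share of error-free questions that the method leaves unflagged."""
    return true_neg / (true_neg + false_pos)


# Hypothetical 2x2 counts (assumption, not taken from the abstract).
tp, fn = 16, 9    # error-prone questions: flagged / missed
tn, fp = 50, 25   # error-free questions: not flagged / wrongly flagged

print(f"sensitivity = {sensitivity(tp, fn):.2f}")
print(f"specificity = {specificity(tn, fp):.2f}")
```

Both quantities are computed against a reference standard, so they say nothing about how often a flagged question is truly error-prone; that would require the predictive values and the prevalence of error-prone questions.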

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Large field studies of travelers' diarrhea for multiple destinations are limited by the need to perform stool cultures on site in a timely manner. A method for the collection, transport, and storage of fecal specimens that does not require immediate processing and refrigeration and that is stable for months would be advantageous. This study was designed to determine if enterotoxigenic Escherichia coli (ETEC) and enteroaggregative E. coli (EAEC) DNA could be identified from cards that were processed for the evaluation of fecal occult blood. U.S. students traveling to Mexico during 2005 to 2007 were monitored for the occurrence of diarrheal illness. When ill, students provided a stool specimen for culture and occult blood by the standard methods. Cards then were stored at room temperature prior to DNA extraction. Fecal PCR was performed to identify ETEC and EAEC in DNA extracted from stools and from occult blood cards. Significantly more EAEC cases were identified by PCR performed on DNA extracted from cards (49%) or from frozen feces (40%) than by culture methods that used HEp-2 adherence assays (13%) (P < 0.001). Similarly, more ETEC cases were detected from card DNA (38%) than from fecal DNA (30%) or by culture followed by hybridization (10%) (P < 0.001). The sensitivity and specificity of the card test were 75 and 62%, respectively, compared to those for EAEC by culture and were 50 and 63%, respectively, compared to those for ETEC. DNA extracted from fecal cards that were used for the detection of occult blood is of use in identifying diarrheagenic E. coli.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

This study applies the multilevel analysis technique to longitudinal data of a large clinical trial. The technique accounts for the correlation at different levels when modeling repeated blood pressure measurements taken throughout the trial. This modeling allows for closer inspection of the remaining correlation and non-homogeneity of variance in the data. Three methods of modeling the correlation were compared.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

The relationship between degree of diastolic blood pressure (DBP) reduction and mortality was examined among hypertensives, ages 30-69, in the Hypertension Detection and Follow-up Program (HDFP). The HDFP was a multi-center community-based trial, which followed 10,940 hypertensive participants for five years. One-year survival was required for inclusion in this investigation since the one-year annual visit was the first occasion where change in blood pressure could be measured on all participants. During the subsequent four years of follow-up on 10,052 participants, 568 deaths occurred. For levels of change in DBP and for categories of variables related to mortality, the crude mortality rate was calculated. Time-dependent life tables were also calculated so as to utilize available blood pressure data over time. In addition, the Cox life table regression model, extended to take into account both time-constant and time-dependent covariates, was used to examine the relationship between change in blood pressure over time and mortality. The results of the time-dependent life table and time-dependent Cox life table regression analyses supported the existence of a quadratic function which modeled the relationship between DBP reduction and mortality, even after adjusting for other risk factors. The minimum mortality hazard ratio, based on a particular model, occurred at a DBP reduction of 22.6 mm Hg (standard error = 10.6) in the whole population and 8.5 mm Hg (standard error = 4.6) in the baseline DBP stratum 90-104. After this reduction, there was a small increase in the risk of death. There was no evidence of the quadratic function after fitting the same model using systolic blood pressure. Methodologic issues involved in studying a particular degree of blood pressure reduction were considered. The confidence interval around the change corresponding to the minimum hazard ratio was wide and the obtained blood pressure level should not be interpreted as a goal for treatment.
Blood pressure reduction was attributed not only to pharmacologic therapy, but also to regression to the mean and to other unknown factors unrelated to treatment. Therefore, the surprising results of this study do not provide direct implications for treatment, but strongly suggest replication in other populations.
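The location of the minimum hazard described above follows from elementary calculus once a quadratic term is in the model. The sketch below assumes the quadratic enters the log hazard of a Cox-type model; the symbols $x$ (DBP reduction in mm Hg), $\beta_1$, $\beta_2$ and $h_0(t)$ are illustrative notation, not taken from the abstract, and the adjustment covariates are omitted.

```latex
\log h(t \mid x) = \log h_0(t) + \beta_1 x + \beta_2 x^2, \qquad \beta_2 > 0

\frac{d}{dx}\left(\beta_1 x + \beta_2 x^2\right) = \beta_1 + 2\beta_2 x = 0
\quad\Longrightarrow\quad
x^{*} = -\frac{\beta_1}{2\beta_2}
```

Because $x^{*}$ is a ratio of two estimated coefficients, its standard error can be large even when each coefficient is estimated reasonably well, which is consistent with the wide confidence interval the abstract reports.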

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Problems due to the lack of data standardization and data management have led to work inefficiencies for the staff working with the vision data for the Lifetime Surveillance of Astronaut Health. Data have been collected over 50 years in a variety of manners and then entered into software. The lack of communication between the electronic health record (EHR) form designer, epidemiologists, and optometrists has led to some level of confusion about the capability of the EHR system and how its forms can be designed to fit all the needs of the relevant parties. EHR form customizations or form redesigns were found to be critical for using NASA's EHR system in the most beneficial way for its patients, optometrists, and epidemiologists. In order to implement a protocol, the data being collected were examined to find the differences in data collection methods. Changes were implemented through the establishment of a process improvement team (PIT). Based on the findings of the PIT, suggestions have been made to improve the current EHR system. If the suggestions are implemented correctly, this will not only improve the efficiency of the staff at NASA and its contractors, but also set guidelines for changes in other forms such as the vision exam forms. Because NASA is at the forefront of such research and health surveillance, this management change could drastically improve data collection and the adaptability of the EHR. Accurate data collection from this 50+ year study is ongoing and will help current and future generations understand the implications of space flight for human health. It is imperative that the vast amount of information is documented correctly.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Clinical Research Data Quality Literature Review and Pooled Analysis

We present a literature review and secondary analysis of data accuracy in clinical research and related secondary data uses. A total of 93 papers meeting our inclusion criteria were categorized according to the data processing methods. Quantitative data accuracy information was abstracted from the articles and pooled. Our analysis demonstrates that the accuracy associated with data processing methods varies widely, with error rates ranging from 2 errors per 10,000 fields to 5019 errors per 10,000 fields. Medical record abstraction was associated with the highest error rates (70–5019 errors per 10,000 fields). Data entered and processed at healthcare facilities had comparable error rates to data processed at central data processing centers. Error rates for data processed with single entry in the presence of on-screen checks were comparable to double entered data. While data processing and cleaning methods may explain a significant amount of the variability in data accuracy, additional factors not resolvable here likely exist.

Defining Data Quality for Clinical Research: A Concept Analysis

Despite notable previous attempts by experts to define data quality, the concept remains ambiguous and subject to the vagaries of natural language. This current lack of clarity continues to hamper research related to data quality issues. We present a formal concept analysis of data quality, which builds on and synthesizes previously published work. We further posit that discipline-level specificity may be required to achieve the desired definitional clarity. To this end, we combine work from the clinical research domain with findings from the general data quality literature to produce a discipline-specific definition and operationalization for data quality in clinical research. While the results are helpful to clinical research, the methodology of concept analysis may be useful in other fields to clarify data quality attributes and to achieve operational definitions.

Medical Record Abstractors' Perceptions of Factors Impacting the Accuracy of Abstracted Data

Medical record abstraction (MRA) is known to be a significant source of data errors in secondary data uses. Factors impacting the accuracy of abstracted data are not reported consistently in the literature. Two Delphi processes were conducted with experienced medical record abstractors to assess abstractors' perceptions about the factors. The Delphi process identified 9 factors that were not found in the literature, and differed with the literature by 5 factors in the top 25%. The Delphi results refuted seven factors reported in the literature as impacting the quality of abstracted data. The results provide insight into and indicate content validity of a significant number of the factors reported in the literature. Further, the results indicate general consistency between the perceptions of clinical research medical record abstractors and registry and quality improvement abstractors.

Distributed Cognition Artifacts on Clinical Research Data Collection Forms

Medical record abstraction, a primary mode of data collection in secondary data use, is associated with high error rates. Distributed cognition in medical record abstraction has not been studied as a possible explanation for abstraction errors. We employed the theory of distributed representation and representational analysis to systematically evaluate cognitive demands in medical record abstraction and the extent of external cognitive support employed in a sample of clinical research data collection forms. We show that the cognitive load required for abstraction in 61% of the sampled data elements was high, exceedingly so in 9%. Further, the data collection forms did not support external cognition for the most complex data elements. High working memory demands are a possible explanation for the association of data errors with data elements requiring abstractor interpretation, comparison, mapping or calculation. The representational analysis used here can be used to identify data elements with high cognitive demands.
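The pooled analysis above reports accuracy as errors per 10,000 fields. As a small illustration of that unit, the following Python sketch normalizes a raw error count and converts it to a percentage; the function names and the audit counts in the usage lines are hypothetical and are not taken from the abstracts.

```python
def errors_per_10k(errors: int, fields_checked: int) -> float:
    """Normalize a raw error count to errors per 10,000 fields."""
    return errors / fields_checked * 10_000


def per_10k_to_percent(rate_per_10k: float) -> float:
    """Convert errors per 10,000 fields to a percentage of fields in error."""
    return rate_per_10k / 100


# Hypothetical audit (assumption): 35 discrepancies found in 5,000 fields.
rate = errors_per_10k(35, 5_000)
print(f"{rate:.0f} errors per 10,000 fields = {per_10k_to_percent(rate):.2f}% of fields")

# The upper end of the range quoted above, expressed as a percentage.
print(f"5019 per 10,000 = {per_10k_to_percent(5019):.2f}% of fields")
```

Expressed this way, the top of the reported range corresponds to roughly half of all fields being in error, which makes the spread between processing methods easier to appreciate.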

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Despite continued research and public health efforts to reduce smoking during pregnancy, prenatal cessation rates in the United States have decreased and the incidence of low birth weight has increased from 1985 to 1991. Lower socioeconomic status women who are at increased risk for poor pregnancy outcomes may be resistant to current intervention efforts during pregnancy. The purpose of this dissertation was to investigate the determinants of continued smoking and quitting among low-income pregnant women. Using data from cross-sectional surveys of 323 low-income pregnant smokers, the first study developed and tested measures of the pros and cons of smoking during pregnancy. The original decisional balance measure for smoking was compared with a new measure that added items thought to be more salient to the target population. Confirmatory factor analysis using structural equation modeling showed neither the original nor new measure fit the data adequately. Using behavioral science theory, content from interviews with the population, and statistical evidence, two 7-item scales representing the pros and cons were developed from a portion (n = 215) of the sample and successfully cross-validated on the remainder of the sample (n = 108). Logistic regression found only pros were significantly associated with continued smoking. In a discriminant function analysis, stage of change was significantly associated with pros and cons of smoking. The second study examined the structural relationships between psychosocial constructs representing some of the levels of and the pros and cons of smoking. The cross-sectional design mandates that statements made regarding prediction do not prove causation or directionality from the data or methods analysis.
Structural equation modeling found the following: more stressors and family criticism were significantly more predictive of negative affect than social support; a bi-directional relationship was found between negative affect and current nicotine addiction; and negative affect, addiction, stressors, and family criticism were significant predictors of pros of smoking. The findings imply reversing the trend of decreasing smoking cessation during pregnancy may require supplementing current interventions for this population of pregnant smokers with programs addressing nicotine addiction, negative affect, and other psychosocial factors such as family functioning and stressors.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Social capital, a relatively new public health concept, represents the intangible resources embedded in social relationships that facilitate collective action. Current interest in the concept stems from empirical studies linking social capital with health outcomes. However, in order for social capital to function as a meaningful research variable, conceptual development aimed at refining the domains, attributes, and boundaries of the concept is needed. An existing framework of social capital (Uphoff, 2000), developed from studies in India, was selected for congruence with the inductive analysis of pilot data from a community that was unsuccessful at mobilizing collective action. This framework provided the underpinnings for a formal ethnographic research study designed to examine the components of social capital in a community that had successfully mobilized collective action. The specific aim of the ethnographic study was to examine the fittingness of Uphoff's framework in the contrasting American community. A contrasting context was purposefully selected to distinguish essential attributes of social capital from those that were specific to one community. Ethnographic data collection methods included participant observation, formal interviews, and public documents. Data were originally analyzed according to codes developed from Uphoff's theoretical framework. The results from this analysis were only partially satisfactory, indicating that the theoretical framework required refinement. The refinement of the coding system resulted in the emergence of an explanatory theory of social capital that was tested with the data collected from formal fieldwork.
Although Uphoff's framework was useful, the refinement of the framework revealed (1) trust as the dominant attribute of social capital, (2) efficacy of mutually beneficial collective action as the outcome indicator, (3) cognitive and structural domains more appropriately defined as the cultural norms of the community and group, and (4) a definition of social capital as the combination of the cognitive norms of the community and the structural norms of the group that are either constructive or destructive to the development of trust and the efficacy of mutually beneficial collective action. This explanatory framework holds increased pragmatic utility for public health practice and research.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Case-control and retrospective studies have identified parental substance abuse as a risk factor for physical child abuse and neglect (Dore, Doris, & Wright, 1995, May; S. R. Dube et al., 2001; Guterman & Lee, 2005, May; Walsh, MacMillan, & Jamieson, 2003). The purpose of this paper is to present the findings of a systematic review of prospective studies from 1975 through 2005 that include parental substance abuse as a risk factor for physical child abuse or neglect. Characteristics of each study such as the research question, sample information, data collection methods and results, including the parent assessed and definitions of substance abuse and physical child abuse and neglect, are discussed. Five studies were identified that met the search criteria. Four of five studies found that parental substance abuse was a significant variable in predicting physical child abuse and neglect.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

With the recognition of the importance of evidence-based medicine, there is an emerging need for methods to systematically synthesize available data. Specifically, methods to provide accurate estimates of test characteristics for diagnostic tests are needed to help physicians make better clinical decisions. To provide more flexible approaches for meta-analysis of diagnostic tests, we developed three Bayesian generalized linear models. Two of these models, a bivariate normal and a binomial model, analyzed pairs of sensitivity and specificity values while incorporating the correlation between these two outcome variables. Noninformative independent uniform priors were used for the variance of sensitivity, specificity and correlation. We also applied an inverse Wishart prior to check the sensitivity of the results. The third model was a multinomial model where the test results were modeled as multinomial random variables. All three models can include specific imaging techniques as covariates in order to compare performance. Vague normal priors were assigned to the coefficients of the covariates. The computations were carried out using the 'Bayesian inference using Gibbs sampling' implementation of Markov chain Monte Carlo techniques. We investigated the properties of the three proposed models through extensive simulation studies. We also applied these models to a previously published meta-analysis dataset on cervical cancer as well as to an unpublished melanoma dataset. In general, our findings show that the point estimates of sensitivity and specificity were consistent among Bayesian and frequentist bivariate normal and binomial models. However, in the simulation studies, the estimates of the correlation coefficient from Bayesian bivariate models were not as good as those obtained from frequentist estimation, regardless of which prior distribution was used for the covariance matrix.
The Bayesian multinomial model consistently underestimated the sensitivity and specificity regardless of the sample size and correlation coefficient. In conclusion, the Bayesian bivariate binomial model provides the most flexible framework for future applications because of the following strengths: (1) it facilitates direct comparison between different tests; (2) it captures the variability in both sensitivity and specificity simultaneously as well as the intercorrelation between the two; and (3) it can be directly applied to sparse data without ad hoc correction.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Patients who had started HAART (Highly Active Anti-Retroviral Treatment) under previous aggressive DHHS guidelines (1997) underwent lifelong continuous HAART that was associated with many short-term as well as long-term complications. Many interventions attempted to reduce those complications, including intermittent treatment, also called pulse therapy. Many studies were done to study the determinants of the rate of fall in CD4 count after interruption, as these data would help guide treatment interruptions. The data set used here was part of a cohort study taking place at the Johns Hopkins AIDS service since January 1984, in which the data were collected both prospectively and retrospectively. The data set consisted of 47 patients receiving pulse therapy with the aim of reducing the long-term complications. The aim of this project was to study the impact of virologic and immunologic factors on the rate of CD4 loss after treatment interruption. The exposure variables under investigation included CD4 cell count and viral load at treatment initiation. The rates of change of CD4 cell count after treatment interruption were estimated from observed data using advanced longitudinal data analysis methods (i.e., a linear mixed model). Using random effects accounted for repeated measures of CD4 per person after treatment interruption. The regression coefficient estimates from the model were then used to produce subject-specific rates of CD4 change accounting for group trends in change. The exposure variables of interest were age, race, gender, and CD4 cell counts and HIV RNA levels at HAART initiation. The rate of fall of CD4 count did not depend on CD4 cell count or viral load at initiation of treatment. Thus these factors may not be used to determine who can have a chance of successful treatment interruption.
CD4 and viral load were again studied by t-tests and ANOVA after grouping based on medians and quartiles to look for any difference in the mean rate of CD4 fall after interruption. There was no significant difference between the groups, suggesting that there was no association between the rate of fall of CD4 after treatment interruption and the above-mentioned exposure variables.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Medication errors, one of the most frequent types of medical errors, are a common cause of patient harm in hospital systems today. Nurses at the bedside are in a position to encounter many of these errors since they are there at the start of the process (ordering/prescribing) and the end of the process (administration). One of the recommendations from the IOM (Institute of Medicine) report, "To Err is Human," was for organizations to identify and learn from medical errors through event reporting systems. While many organizations have reporting systems in place, research studies report a significant amount of underreporting by nurses. A systematic review of the literature was performed to identify contributing factors related to the reporting and not reporting of medication errors by nurses at the bedside. Articles included in the literature review were primary or secondary studies, dated January 1, 2000 to July 2009, related to nursing medication error reporting. All 634 articles were reviewed with an algorithm developed to standardize the review process and help filter out those that did not meet the study criteria. In addition, 142 article bibliographies were reviewed to find additional studies that were not found in the original literature search. After reviewing the 634 articles and the additional 108 articles discovered in the bibliography review, 41 articles met the study criteria and were used in the systematic literature review results. Fear of punitive reactions to medication errors was a frequent barrier to error reporting. Nurses fear reactions from their leadership, peers, patients and their families, nursing boards, and the media. Anonymous reporting systems and departments/organizations with a strong safety culture in place helped to encourage the reporting of medication errors by nursing staff. Many of the studies included in this literature review do not yield results that can be generalized.
The majority of them took place in single institutions/organizations with limited sample sizes. Stronger studies with larger sample sizes need to be performed, utilizing data collection methods that have been validated, to determine stronger correlations between safety cultures and nurse error reporting.

Relevance:

100.00% 100.00%

Publisher:

Abstract:

Objective: In this secondary data analysis, three statistical methodologies were implemented to handle cases with missing data in a motivational interviewing and feedback study. The aim was to evaluate the impact that these methodologies have on the data analysis. Methods: We first evaluated whether the assumption of missing completely at random held for this study. We then proceeded to conduct a secondary data analysis using a mixed linear model to handle missing data with three methodologies: (a) complete case analysis, (b) multiple imputation with an explicit model containing outcome variables, time, and the interaction of time and treatment, and (c) multiple imputation with an explicit model containing outcome variables, time, the interaction of time and treatment, and additional covariates (e.g., age, gender, smoking, years in school, marital status, housing, race/ethnicity, and whether participants played on an athletic team). Several comparisons were conducted, including the following: 1) the motivational interviewing with feedback group (MIF) vs. the assessment only group (AO), the motivational interviewing group (MIO) vs. AO, and the feedback only group (FBO) vs. AO, 2) MIF vs. FBO, and 3) MIF vs. MIO. Results: We first evaluated the patterns of missingness in this study, which indicated that about 13% of participants showed monotone missing patterns, and about 3.5% showed non-monotone missing patterns. Then we evaluated the assumption of missing completely at random by Little's missing completely at random (MCAR) test, in which the chi-square test statistic was 167.8 with 125 degrees of freedom, and its associated p-value was 0.006, which indicated that the data could not be assumed to be missing completely at random. After that, we compared whether the three different strategies reached the same results.
For the comparison between MIF and AO as well as the comparison between MIF and FBO, only the multiple imputation with additional covariates by uncongenial and congenial models reached different results. For the comparison between MIF and MIO, all the methodologies for handling missing values obtained different results. Discussion: The study indicated that, first, missingness was crucial in this study. Second, understanding the assumptions of the model was important since we could not identify whether the data were missing at random or missing not at random. Therefore, future research should focus on exploring more sensitivity analyses under the missing-not-at-random assumption.
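The p-value quoted above for Little's MCAR test can be sanity-checked from the reported statistic and degrees of freedom alone. The Python sketch below does not reproduce Little's test itself; it only approximates the upper tail of a chi-square distribution using the Wilson-Hilferty normal approximation (a swapped-in technique chosen so that only the standard library is needed). The function name is illustrative.

```python
import math


def chi2_upper_tail_wh(x: float, df: int) -> float:
    """Approximate P(X >= x) for X ~ chi-square(df) via Wilson-Hilferty.

    The cube root of X/df is treated as approximately normal with
    mean 1 - 2/(9*df) and variance 2/(9*df).
    """
    c = 2.0 / (9.0 * df)
    z = ((x / df) ** (1.0 / 3.0) - (1.0 - c)) / math.sqrt(c)
    # Upper tail of the standard normal distribution.
    return 0.5 * math.erfc(z / math.sqrt(2.0))


# Statistic and degrees of freedom as reported in the abstract.
p = chi2_upper_tail_wh(167.8, 125)
print(f"approximate p-value: {p:.4f}")
```

The approximation is accurate enough for a plausibility check at this many degrees of freedom; an exact tail probability would come from a chi-square survival function in a statistics library.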

Relevance:

50.00% 50.00%

Publisher:

Abstract:

Brain tumor is one of the most aggressive types of cancer in humans, with an estimated median survival time of 12 months and only 4% of the patients surviving more than 5 years after disease diagnosis. Until recently, brain tumor prognosis has been based only on clinical information such as tumor grade and patient age, but there are reports indicating that molecular profiling of gliomas can reveal subgroups of patients with distinct survival rates. We hypothesize that coupling molecular profiling of brain tumors with clinical information might improve predictions of patient survival time and, consequently, better guide future treatment decisions. In order to evaluate this hypothesis, the general goal of this research is to build models for survival prediction of glioma patients using DNA molecular profiles (U133 Affymetrix gene expression microarrays) along with clinical information. First, a predictive Random Forest model is built for binary outcomes (i.e., short- vs. long-term survival) and a small subset of genes whose expression values can be used to predict survival time is selected. Next, a new statistical methodology is developed for predicting time-to-death outcomes using Bayesian ensemble trees. Due to the large heterogeneity observed within prognostic classes obtained by the Random Forest model, prediction can be improved by relating time-to-death to the gene expression profile directly. We propose a Bayesian ensemble model for survival prediction which is appropriate for high-dimensional data such as gene expression data. Our approach is based on the ensemble "sum-of-trees" model, which is flexible enough to incorporate additive and interaction effects between genes. We specify a fully Bayesian hierarchical approach and illustrate our methodology for the CPH, Weibull, and AFT survival models. We overcome the lack of conjugacy using a latent variable formulation to model the covariate effects, which decreases computation time for model fitting.
Also, our proposed models provide a model-free way to select important predictive prognostic markers based on controlling false discovery rates. We compare the performance of our methods with baseline reference survival methods and apply our methodology to an unpublished data set of brain tumor survival times and gene expression data, selecting genes potentially related to the development of the disease under study. A closing discussion compares results obtained by Random Forest and Bayesian ensemble methods from the biological/clinical perspectives and highlights the statistical advantages and disadvantages of the new methodology in the context of DNA microarray data analysis.